Abstract: We investigate optimal encoding and retrieval of 
digital data, when the storage/communication medium is de- 
scribed by quantum mechanics. We assume an m-ary alphabet 
with arbitrary prior distribution, and an n-dimensional quantum 
system. Under these constraints, we seek an encoding-retrieval 
setup, comprised of code-states and a quantum measurement, 
which maximizes the probability of correct detection. In our 
development, we consider two cases. In the first, the measurement 
is predefined and we seek the optimal code-states. In the second, 
optimization is performed on both the code-states and the 
measurement. 

We show that one cannot outperform 'pseudo-classical trans- 
mission', in which we transmit n symbols with orthogonal 
code-states, and discard the remaining symbols. However, such 
pseudo-classical transmission is not the only optimum. We fully 
characterize the collection of optimal setups, and briefly discuss 
the links between our findings and applications such as quantum 
key distribution and quantum computing. We conclude with a 
number of results concerning the design under an alternative 
optimality criterion, the worst-case posterior probability, which 
serves as a measure of the retrieval reliability. 

Index Terms: transmitter design, quantum detection, quantum
key distribution, semidefinite programming, bilinear matrix
inequality.



I. Introduction 

UNDERLYING any scheme for the storage or transmis- 
sion of information is a physical medium. The encoding 
and the retrieval of information must therefore involve con- 
siderations as to the nature of the medium, with regard to 
possible corruption of the retrieved data, due to interaction 
with the environment or to physical limitations of the medium 
itself. Examples of media and information encoding range 
from letters printed in ink on paper, through electric charge 
stored in a capacitor, to photons travelling through an optical 
fiber. This work is concerned with the encoding of digital 
information in media, whose physics is described by the laws 
of quantum mechanics [1]. 

We concentrate on digital information with a finite alphabet, 
i.e. the data is one of m possible messages, each one associated 
with a prior probability pi. Retrieval of the data is done by 
performing a measurement, thereby detecting the state of the 
system. 

There are several common criteria for the assessment of 
information retrieval, which can, for the most part, be divided 
into two categories. The first is comprised of criteria whose 
motivation stems from information theory (e.g. mutual infor- 
mation [2]). The second type of criteria aim to measure the 
reliability of the "per symbol" retrieval, without taking into 
account any pre- or post-processing (channel coding). The 
criteria we address in this work are of the second kind. 

The laws of quantum mechanics state that the outcome of 
a measurement, in our case an attempt to retrieve the encoded 
symbol, is random. Thus, a quantum encoding-retrieval setup 
is characterized by the transition probabilities 

Pr{i|j} = Pr{out = symbol i | in = symbol j}.

This is reminiscent of the more common classical setups, but 
whereas the randomness there is induced by noise from the 
environment, in the quantum case, the randomness is inherent 
in the system itself. 

The state of a quantum system is mathematically represented
by a unit-trace positive semidefinite operator $\rho$ on an
$n$-dimensional Hilbert space $\mathcal{H}$. Encoding digital
information in a quantum system is done by preparing the system
in one of $m$ predefined states $\{\rho_i\}_{i=1}^m$, each
associated with one of the possible messages. Retrieval is
achieved by performing a measurement, and determining in which
of these predetermined states the system has been prepared.

However, if a quantum system is in one of several states
whose range spaces are not orthogonal, i.e. $\rho_i \rho_j \neq 0$,
then no measurement permitted in quantum mechanics can determine
without fail which of the states is present; there is a non-zero
probability of detection error, i.e. Pr{i|j} > 0 for some $i \neq j$.
The question is then what valid quantum measurement would
yield favorable detection performance.

A popular measure of performance, and the one which is
the main interest of this work, is the probability of correct
detection

$$P_d = \sum_{i=1}^{m} p_i \Pr\{i|i\}.$$


One of the contributions of this work is a complete
characterization of the encoding-retrieval setups which maximize
$P_d$ under the constraints imposed by the postulates of quantum
mechanics. We also present several new results concerning
a different performance measure, the worst-case posterior
probability, which is defined in Section VI.

The focus of this paper is the design of a complete digital
communications channel (or memory unit), in which the
designer can choose both the code-states $\rho_i$ and the detection
measurement. We assume that the nature of the data, which is
designated by the number of possible symbols $m$ and their
prior probabilities $p_i$, is known. We also assume that the
dimension $n$ of the quantum system is given. The dimension
of the quantum system determines the ability of the medium to
transmit (or store) data reliably, much like the signal to noise
ratio in classical systems.

Thus, we seek the optimal setup, comprised of code-states
and a measurement, that maximizes $P_d$ under a constraint on the
dimension $n$ of the system. We find the maximum attainable
value of $P_d$ for data with an arbitrary prior distribution, and
completely characterize the optimal setups which achieve this
value of $P_d$.

When the number of symbols $m$ is no larger than the
dimension $n$ of the Hilbert space, one can simply choose the
$\rho_i$ as orthogonal pure states and attain perfect detection.
When $m > n$ this is no longer possible, and quantum encoding
becomes non-trivial.

Motivation for using many symbols in a quantum system 
of low dimension may stem from benefits, which a protocol 
provides, for which one is willing to sacrifice the probability of 
detection or the information rate. For instance, in protocols of 
quantum key distribution [3] the use of many states enables the 
detection of eavesdropping on the communication. In Section
V-C we elaborate on this point. Another possible scenario
is a quantum computation, which has a finite number of
possible outputs $m$, and where for reasons of implementation
complexity one cannot create a system large enough (with
enough qubits such that $m \le n$).

The problem of distinguishing among a collection of specified
quantum states, i.e. when the code-states $\rho_i$ are given,
is regularly referred to as quantum detection or quantum
state discrimination, and has been studied in detail. Necessary
and sufficient conditions for an optimal measurement, which
maximizes the probability of correct detection $P_d$, have been
derived [4], [5], [6]. Explicit solutions to the problem are
known in some particular cases [7], [8], [9], [10], [11],
including ensembles obeying a large class of symmetries [12].
The optimal measurement can also be calculated numerically,
to within arbitrary accuracy, and in polynomial complexity [6].

Several alternative approaches have also been investigated. 
These include optimization with regard to other performance 
criteria, such as mutual information [4] or the worst-case 
posterior probability [13]. Another approach is unambiguous 
detection [14], [15], [16] in which one allows for an inconclu- 
sive result but does not allow for error. More recently, interest 
has grown in detection in a noisy environment [17], [18], [19], 
and in situations where the states are only partially known [20] 
or the prior probabilities not specified [21]. 

In Section II the problem is presented in more detail.
Then, in Section III we show that the optimal code-states
for a predetermined measurement are states which lie in the
eigenspaces of the measurement operators associated with the
maximal eigenvalues. This result is of interest both in its own
right, and as part of the design of complete optimal
encoding-retrieval setups.

Sections IV and V are the heart of this work. In Section IV
we show that when encoding digital information in a quantum
system of dimension $n$, the maximum attainable probability of
correct detection may be achieved by simply discarding $m - n$
of the symbols and using an orthonormal set to encode the
remaining $n$ symbols with perfect reconstruction. We dub this
method pseudo-classical transmission. This is, however, not
the only possible encoding-retrieval setup which achieves the
maximal value of $P_d$. In Section V we show that all setups
that attain the maximum are composed of pure code-states
and of rank-1 measurement operators, and fully characterize
the collection of optimal setups. The importance of finding all
the optimal setups is discussed in Subsection V-C, where we
outline possible use of our results in the analysis of quantum
communication and computation protocols.

In Section VI we explore performance in relation to a
measure of the reliability of the outcome. We introduce the
worst-case posterior probability, denoted $P_p$. Again, when
$m > n$ and perfect communication is impossible, the output
can never be fully reliable. We provide a simple method for
finding an upper bound on $P_p$ for arbitrary states $\rho_i$ and
prior probabilities $p_i$.

Regrettably, for a large family of encoding-retrieval setups,
$P_p$ is ill-defined. For this reason, we also define a variation on
$P_p$ that we name the effective worst-case posterior probability.
We investigate how one should choose the code-states which
represent discarded symbols in pseudo-classical transmission,
in order to increase the reliability of the output, while still
attaining maximal $P_d$. We develop an upper bound on this
effective measure, and present a choice which attains it.

II. Problem Formulation 

A. Notation 

According to the postulates of quantum mechanics [1],
a physical system is mathematically represented by an
$n$-dimensional complex Hilbert space $\mathcal{H}$. The state of
the system $\rho$ is represented by a positive semidefinite (PSD)
Hermitian operator on $\mathcal{H}$, such that $\mathrm{Tr}(\rho) = 1$.
Throughout, we shall use the notation $A \ge 0$ to indicate that an
operator $A$ is PSD, and the notation $A \ge B$ to imply that
$A - B$ is PSD. If $\mathrm{rank}(\rho) = 1$, then $\rho$ is known as
a pure state.

As is customary in work relating to quantum theory, we
shall use Dirac's notation of linear algebra, wherein a vector
is denoted by $|u\rangle$, its Hermitian conjugate by $\langle u|$,
and inner and outer products are signified by $\langle u|v\rangle$
and $|u\rangle\langle v|$ respectively. We do not assume that
$|u\rangle$ is normalized. We denote by $\mathcal{R}(A)$ the range
space of a Hermitian operator $A$, and by $\mathcal{M}(A)$ the
eigenspace of its maximal eigenvalue.

B. Encoding Data in Quantum Media 

We wish to encode digital information in a quantum
medium. The information is represented by an $m$-ary alphabet,
where each symbol has a prior probability $p_i$. Without loss
of generality, we assume that the prior distribution obeys
$p_1 \ge p_2 \ge \cdots \ge p_m > 0$. The encoding is achieved by
associating with each symbol a predefined quantum state $\rho_i$,
and preparing the system in the appropriate state. We shall
refer to the states $\rho_i$ as code-states. We refer to a set of
code-states $\{\rho_i\}_{i=1}^m$ as an ensemble. Whenever an
ensemble is arbitrary, we assume that it spans $\mathcal{H}$ (footnote 1).

Footnote 1: If it does not span $\mathcal{H}$, the problem can always
be projected onto the subspace which it spans.

Retrieval of the information is accomplished by using a
positive operator valued measurement (POVM), which is a
set of $m$ operators $\Pi = \{\Pi_i\}_{i=1}^m$ which satisfy

$$\Pi_i \ge 0, \quad 1 \le i \le m, \qquad \sum_{i=1}^{m} \Pi_i = I.$$



This is the most general type of measurement allowed by the 
laws of quantum physics. 
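
The two POVM conditions are easy to test numerically. The following is a minimal sketch, assuming operators are given as NumPy arrays; the helper name `is_povm` and the tolerance are illustrative choices, not part of the paper.

```python
import numpy as np

def is_povm(operators, tol=1e-9):
    """Return True if every operator is Hermitian PSD and they sum to I."""
    ops = [np.asarray(op, dtype=complex) for op in operators]
    n = ops[0].shape[0]
    for op in ops:
        # Each element must be Hermitian with non-negative eigenvalues.
        if not np.allclose(op, op.conj().T, atol=tol):
            return False
        if np.linalg.eigvalsh(op).min() < -tol:
            return False
    # Completeness: the elements must sum to the identity.
    return bool(np.allclose(sum(ops), np.eye(n), atol=tol))

# Illustrative case with m = 3 outcomes on a qubit (n = 2):
# two orthogonal projectors plus a zero operator.
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
povm_ok = is_povm([np.outer(e0, e0), np.outer(e1, e1), np.zeros((2, 2))])
povm_bad = is_povm([np.outer(e0, e0), np.outer(e0, e0)])
```

Note that a zero operator is a legitimate POVM element; it simply corresponds to an outcome that is never observed.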

The measurement results in one of $m$ possible outcomes,
where, given that the state of the system is $\rho$, the probability
of the $i$-th outcome is

$$\Pr\{i\} = \mathrm{Tr}(\Pi_i \rho).$$

Thus, the probability of correctly detecting the encoded
message is

$$P_d = P_d(\Pi_i, \rho_i) = \sum_{i=1}^{m} p_i \mathrm{Tr}(\Pi_i \rho_i).$$



In this work we use Pd as the main criterion for measuring 
the quality of an encoding-retrieval setup. 
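
As an illustration of this criterion, the sketch below evaluates the probability of correct detection for a small setup. The function name `prob_correct_detection` and the example priors are assumptions made here for illustration only.

```python
import numpy as np

def prob_correct_detection(priors, povm, states):
    """P_d = sum_i p_i Tr(Pi_i rho_i) for matched lists of equal length."""
    return float(sum(p * np.trace(Pi @ rho).real
                     for p, Pi, rho in zip(priors, povm, states)))

# Three symbols stored in a qubit: the two most likely symbols use
# orthogonal pure states, the third is never detected.
priors = [0.5, 0.3, 0.2]
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
povm = [np.outer(e0, e0), np.outer(e1, e1), np.zeros((2, 2))]
states = [np.outer(e0, e0), np.outer(e1, e1), np.eye(2) / 2]

p_d = prob_correct_detection(priors, povm, states)  # 0.5 + 0.3 + 0
```

The third code-state is chosen arbitrarily (here the maximally mixed state), since its measurement operator is zero and it does not contribute to the sum.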

In the next section we find the optimal code-states, in the
sense of maximal $P_d$, for a given measurement. We then
characterize, in Sections IV and V, all optimal
encoding-retrieval setups, when the design specifications are the
nature of the data (the prior probabilities $p_i$), and the
dimension $n$ of the quantum system.

In Section VI we develop several results concerning an
alternative measure of performance, the worst-case posterior
probability. This criterion is an indicator of the reliability of
the output, and is defined at the beginning of Section VI.

III. Designing Code-States for an Arbitrary 
Measurement 

In this section we answer the following question. If the
detector, i.e. the measurement $\Pi$, and the prior probabilities
of the data $p_i$ are predetermined, what would be a good choice
of code-states $\rho_i$ to encode the data in a quantum medium of
dimension $n$, in terms of $P_d$? This question is of interest
due to possible implementation restrictions on the detector.
As indicated in the introduction, the reverse situation, that
of designing a measurement to discriminate among arbitrary
states, has been thoroughly studied.

Our result is stated formally in Theorem 1.

Theorem 1: Let $\{p_i\}_{i=1}^m$ be a probability distribution, and
let $\{\Pi_i\}_{i=1}^m$ be the measurement operators of a detector. An
ensemble of quantum states $\{\rho_i\}_{i=1}^m$ maximizes $P_d$ if and
only if

$$\mathcal{R}(\rho_i) \subseteq \mathcal{M}(\Pi_i), \quad 1 \le i \le m.$$

Denoting by $\sigma^{\max}_{\Pi_i}$ the maximal eigenvalue of $\Pi_i$,
the maximal probability of correct detection is given by

$$P_d^{\mathrm{opt}} = \sum_{i=1}^{m} p_i \sigma^{\max}_{\Pi_i}.$$

Note that for all $i$ such that $\Pi_i = 0$, one has that
$\mathcal{M}(\Pi_i) = \mathcal{H}$, and any choice of $\rho_i$ is optimal.



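Proof: The optimal states $\rho_i$ are a solution to

$$\max_{\rho_i} \sum_{i=1}^{m} p_i \mathrm{Tr}(\Pi_i \rho_i) \quad \text{s.t.} \quad \rho_i \ge 0, \quad \mathrm{Tr}(\rho_i) = 1. \tag{1}$$

The objective function in (1) is additive in the variables $\rho_i$,
and the constraints on each of the $\rho_i$ are independent. Hence,
(1) is separable in $i$, i.e. the states $\rho_i$ are optimal if and
only if they are also the solutions to $m$ problems of the form (one
for each $i$)

$$\max_{\rho} \mathrm{Tr}(\Pi \rho) \quad \text{s.t.} \quad \rho \ge 0, \quad \mathrm{Tr}(\rho) = 1. \tag{2}$$
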



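Any quantum state $\rho$, such that $\rho \ge 0$ and $\mathrm{Tr}(\rho) = 1$,
has an eigendecomposition of the form

$$\rho = \sum_{j=1}^{n} g_j |u_j\rangle\langle u_j|,$$

where $g_j \ge 0$, $\sum_{j=1}^{n} g_j = 1$, and $\langle u_j|u_j\rangle = 1$.
Since $\Pi \ge 0$, we have that

$$\mathrm{Tr}(\Pi \rho) = \sum_{j=1}^{n} g_j \langle u_j|\Pi|u_j\rangle \le \langle \hat{u}|\Pi|\hat{u}\rangle \sum_{j=1}^{n} g_j = \langle \hat{u}|\Pi|\hat{u}\rangle \le \sigma^{\max}_{\Pi},$$

where $\langle \hat{u}|\Pi|\hat{u}\rangle = \max_j \langle u_j|\Pi|u_j\rangle$,
and $\sigma^{\max}_{\Pi}$ is the largest eigenvalue of $\Pi$. If $\Pi = 0$
then the upper bound is zero and any $\rho \ge 0$ is optimal. When
$\Pi \neq 0$, equality $\mathrm{Tr}(\Pi \rho) = \sigma^{\max}_{\Pi}$ is
achieved only when $\rho$ lies in the eigenspace corresponding to
$\sigma^{\max}_{\Pi}$. ■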

Note that the optimal code-states $\rho_i$ are independent of
each other and of the prior probabilities $p_i$. Also note that
the optima (the solutions of the problem (1)) form a convex
set.
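
The construction in Theorem 1 can be sketched numerically as follows. This is an illustration under stated assumptions: the helper `optimal_code_states` and the three-element "trine" measurement on a qubit are chosen here as an example and do not appear in the paper. For each measurement operator the sketch picks one pure state in the eigenspace of the largest eigenvalue; when that eigenspace has dimension larger than one, other choices are equally optimal.

```python
import numpy as np

def optimal_code_states(povm):
    """For each measurement operator return a pure state supported on the
    eigenspace of its largest eigenvalue, together with that eigenvalue."""
    states, sigma_max = [], []
    for Pi in povm:
        vals, vecs = np.linalg.eigh(Pi)   # eigenvalues in ascending order
        u = vecs[:, -1]                   # unit eigenvector of the largest one
        states.append(np.outer(u, u.conj()))
        sigma_max.append(vals[-1])
    return states, np.array(sigma_max)

# Assumed example detector: Pi_k = (2/3)|psi_k><psi_k| with three real
# unit vectors at 120 degrees, which sum to the identity on a qubit.
angles = 2 * np.pi * np.arange(3) / 3
povm = [(2 / 3) * np.outer(v, v)
        for v in (np.array([np.cos(a), np.sin(a)]) for a in angles)]
priors = np.array([0.5, 0.3, 0.2])

states, sigma_max = optimal_code_states(povm)
p_opt = float(priors @ sigma_max)              # sum_i p_i sigma_i^max
p_achieved = float(sum(p * np.trace(Pi @ rho).real
                       for p, Pi, rho in zip(priors, povm, states)))
```

Here every measurement operator has largest eigenvalue 2/3, so the value predicted by Theorem 1 is 2/3 regardless of the priors, and the constructed ensemble attains it.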

Corollary 1.1: If for all $i$, $\dim \mathcal{M}(\Pi_i) = 1$, then the
ensemble which maximizes $P_d$ is unique.

Proof: When $\dim \mathcal{M}(\Pi_i) = 1$ then $\rho_i$ must be the pure
state which spans $\mathcal{M}(\Pi_i)$, and which is unique (due to the
requirement of normalization). If this is true for all $i$, then the
entire set of code-states is unique. ■

In applications, one may have the freedom to choose which
symbol will be detected by which of the detection operators.
Recalling that we assumed the prior probabilities $p_i$ to be
sorted in descending order, maximal $P_d$ can be attained when
the detection operators are sorted such that
$\sigma^{\max}_{\Pi_1} \ge \sigma^{\max}_{\Pi_2} \ge \cdots \ge \sigma^{\max}_{\Pi_m}$.
Doing this, and selecting the optimal code-states as above, would
lead to the maximal value of $P_d = \sum_{i=1}^{m} p_i \sigma^{\max}_{\Pi_i}$.

IV. Optimal Quantum Encoding 

We now find the maximal attainable value of $P_d$ when
encoding data in a quantum medium. We assume that the nature
of the data itself, which is manifested in the prior probabilities
$p_i$, is predetermined, and so is the quantum system itself (i.e.
the dimension $n$). We aim to find an encoding-retrieval setup
that maximizes $P_d$.

Thus, our goal is to find the solutions to

$$\max_{\Pi_i, \rho_i} \sum_{i=1}^{m} p_i \mathrm{Tr}(\Pi_i \rho_i) \quad \text{s.t.} \quad \begin{cases} \rho_i \ge 0, \quad \mathrm{Tr}(\rho_i) = 1, \\ \Pi_i \ge 0, \quad \sum_{i=1}^{m} \Pi_i = I. \end{cases} \tag{3}$$

This optimization problem is of a class known as Bilinear
Matrix Inequality (BMI) optimization problems [22]. BMIs
are non-convex, and in general, finding a global optimum is
an NP-hard problem [23]. Nonetheless, for this particular BMI
(3), we are able to formulate a closed form solution, and to
completely specify the optimal set.

When the dimension $n$ of the quantum system is equal to
the number of possible messages $m$, then perfect retrieval
($P_d = 1$) is achievable by choosing the code-states $\rho_i$ to be
mutually orthogonal pure states, and the measurement such
that $\Pi_i = \rho_i$. When $n < m$ this is no longer possible.
The most straightforward approach to quantum encoding when
$n < m$ is to simply disregard $m - n$ of the messages and aim
to perfectly retrieve the remaining $n$ messages. It is clear that
the smallest probability of error would occur if the disregarded
messages were the ones with smallest prior probabilities. Thus,
this approach is embodied in the ensemble-detector setup

$$\Pi_i = \begin{cases} |u_i\rangle\langle u_i|, & 1 \le i \le n \\ 0, & n < i \le m \end{cases} \qquad \rho_i = \begin{cases} |u_i\rangle\langle u_i|, & 1 \le i \le n \\ \text{Don't care}, & n < i \le m \end{cases} \tag{4}$$

where $\{|u_i\rangle\}_{i=1}^{n}$ is some orthonormal system. When using
this setup $P_d = \sum_{i=1}^{n} p_i$.

The distinction between classical and quantum systems is
very strongly linked to the fact that non-orthogonality between
two quantum states affects the ability to distinguish between
them. There is no classical analogue of this property. When
the states that a quantum system may be in are mutually
orthogonal, it is said to be in "the classical limit". The
fact that the setup (4) is comprised only of pure mutually
orthogonal states implies that it is classical in nature and that
the losses encountered are not due to the fact that the system is
governed by quantum mechanics, but to a lossy preprocessing
(disregarding some of the messages). In the sequel we refer
to (4) as pseudo-classical transmission.

It would, at first glance, seem that one may somehow be able
to utilize the "quantumness" of the system, i.e. non-orthogonal
code-states and measurements, in order to improve on the
probability of correct detection $P_d$. We now formulate and
prove a theorem which shows this to be impossible.

Theorem 2: Let $\{p_i\}_{i=1}^m$ be a probability distribution with
$p_1 \ge p_2 \ge \cdots \ge p_m > 0$. Denoting by $\hat{P}_d$ the maximal
probability of correct detection for a quantum system of
dimension $n < m$, we have that

$$\hat{P}_d = \sum_{i=1}^{n} p_i.$$


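Proof: Let $\tilde{P}_d = \sum_{i=1}^{n} p_i$, and recall that $\hat{P}_d$
denotes the maximal probability of correct detection. Since the
pseudo-classical setup (4) achieves $P_d(\Pi_i, \rho_i) = \tilde{P}_d$,
we have that $\hat{P}_d \ge \tilde{P}_d$. We prove the theorem by showing
that $\hat{P}_d \le \tilde{P}_d$.

The maximal value of $P_d$ is the solution of (3). From
Theorem 1, after maximizing with respect to $\rho_i$, (3) reduces
to

$$\max_{\Pi_i} \sum_{i=1}^{m} p_i \sigma^{\max}_{\Pi_i} \quad \text{s.t.} \quad \Pi_i \ge 0 \;\; \text{(a)}, \qquad \sum_{i=1}^{m} \Pi_i = I \;\; \text{(b)}. \tag{5}$$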


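The constraint (5a) implies that

$$\sigma^{\max}_{\Pi_i} \ge 0, \quad 1 \le i \le m, \tag{6}$$

and from (5b)

$$\sigma^{\max}_{\Pi_i} \le 1, \quad 1 \le i \le m, \qquad \sum_{i=1}^{m} \sigma^{\max}_{\Pi_i} \le n. \tag{7}$$

(The second expression in (7) is obtained by taking the trace
of (5b).) We now replace (5) by a scalar program,

$$\max_{\sigma_i} \sum_{i=1}^{m} p_i \sigma_i \quad \text{s.t.} \quad 0 \le \sigma_i \le 1, \qquad \sum_{i=1}^{m} \sigma_i \le n. \tag{8}$$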



Problem (8) was created by relaxing the constraints of problem
(5): we keep only the constraints on the eigenvalues and
disregard the original matrix-inequality constraints. Therefore,
the solution of (8) is always greater than or equal to the solution
of (5), and thus serves as an upper bound.

The optimization problem (8) is a linear program. Its
Lagrange dual problem [24] is given by

$$\min_{\eta_i, \nu_i, \mu} g(\eta_i, \mu) \quad \text{s.t.} \quad \eta_i, \nu_i, \mu \ge 0 \;\; \text{(a)}, \qquad p_i - \eta_i + \nu_i - \mu = 0 \;\; \text{(b)}, \tag{9}$$

where $1 \le i \le m$ and

$$g(\eta_i, \mu) = \sum_{i=1}^{m} \eta_i + n\mu.$$

Using the constraint (9b), the variables $\nu_i$ can be eliminated,
yielding

$$\min_{\eta_i, \mu} g(\eta_i, \mu) \quad \text{s.t.} \quad \eta_i, \mu \ge 0, \qquad \eta_i + \mu \ge p_i. \tag{10}$$



From Lagrange duality theory, for any point in the feasibility
set of (10), the objective $g(\eta_i, \mu)$ is greater than or equal to
the solution of the primal problem (8). In other words, for any
dual feasible point $(\eta_i, \mu)$, $g(\eta_i, \mu)$ is an upper bound on
the solution of (8). Consider

$$\hat{\eta}_i = \begin{cases} p_i - p_{n+1}, & 1 \le i \le n \\ 0, & n < i \le m \end{cases} \qquad \hat{\mu} = p_{n+1}. \tag{11}$$

Because $p_1 \ge \cdots \ge p_m$ it is dual feasible. For this choice

$$g(\hat{\eta}_i, \hat{\mu}) = \sum_{i=1}^{m} \hat{\eta}_i + n\hat{\mu} = \sum_{i=1}^{n} (\hat{\eta}_i + \hat{\mu}) = \sum_{i=1}^{n} p_i.$$

In conclusion, we have shown that

$$\max (3) = \max (5) \le \max (8) \le \min (10) \le \sum_{j=1}^{n} p_j,$$

which implies that for any valid ensemble and detector
$P_d = \sum_{i=1}^{m} p_i \mathrm{Tr}(\Pi_i \rho_i) \le \sum_{i=1}^{n} p_i$. ■
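
The duality argument can be checked numerically for a concrete prior. The sketch below is an illustration only; the function name `lp_bound_certificate` and the example distribution are assumptions made here. It evaluates the primal objective of (8) at the point that puts full weight on the $n$ most probable symbols, evaluates the dual objective at the point (11), and confirms that (11) is dual feasible. When the two values coincide, both points are optimal and the bound is tight.

```python
import numpy as np

def lp_bound_certificate(priors, n):
    """Primal value, dual value and dual feasibility for the scalar program (8)."""
    p = np.sort(np.asarray(priors, dtype=float))[::-1]   # non-increasing order
    m = len(p)
    if not n < m:
        raise ValueError("the construction assumes n < m")
    # Primal feasible point: sigma_i = 1 for the n largest priors, else 0.
    sigma = np.concatenate([np.ones(n), np.zeros(m - n)])
    primal = float(p @ sigma)
    # Dual point (11): mu = p_{n+1}, eta_i = p_i - p_{n+1} for i <= n, else 0.
    mu = p[n]                                            # zero-based index of p_{n+1}
    eta = np.concatenate([p[:n] - mu, np.zeros(m - n)])
    feasible = bool(np.all(eta >= 0) and mu >= 0
                    and np.all(eta + mu >= p - 1e-12))
    dual = float(eta.sum() + n * mu)
    return primal, dual, feasible

primal, dual, feasible = lp_bound_certificate([0.4, 0.3, 0.2, 0.1], n=2)
```

For this example both objectives equal $p_1 + p_2 = 0.7$, in agreement with Theorem 2.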

The implication of Theorem 2 is that one can achieve the
optimal probability of correct detection by using orthogonal
pure states and von Neumann measurements, which are easy
to implement. Nevertheless, there may be setups
$\{\rho_i, \Pi_i\}_{i=1}^m$ other than (4) which attain
$P_d(\Pi_i, \rho_i) = \hat{P}_d$. In the next
section we identify all the ensemble-detector setups which
achieve maximum probability of correct detection. The
importance of characterizing the set of optima is that we may
be able to select an optimum that has preferable performance
with regard to other quality of service measures. Also, there
may be communication protocols which require using a
"non-classical" ensemble. These aspects are discussed in greater
detail in Section V-C.

V. Characterization of Optimal Setups 

In this section we introduce the notion of tight frame
encoding setups, and show that all optima are of this form
(Theorem 3). We then fully characterize the set of optima
for a given prior probability distribution (Theorem 4 and
corollaries).

A. Tight Frame Encoding Setups 

A tight frame [25] is a set of $m$ vectors $\{|u_i\rangle\}_{i=1}^m$ which
satisfy

$$\sum_{i=1}^{m} |u_i\rangle\langle u_i| = I. \tag{12}$$

We define a "Tight Frame Encoding Setup" (TFES) to be an
ensemble-detector setup of the form

$$\Pi_i = |u_i\rangle\langle u_i|, \qquad \rho_i = \begin{cases} \dfrac{1}{\langle u_i|u_i\rangle} |u_i\rangle\langle u_i|, & \langle u_i|u_i\rangle > 0 \\ \text{Don't care}, & \langle u_i|u_i\rangle = 0 \end{cases}$$

where the vectors obey (12). The pseudo-classical setup
(4) is an example of a TFES. The probability of correct
detection when using a TFES is $P_d = \sum_{i=1}^{m} p_i \langle u_i|u_i\rangle$.

The constraint (12) on the vectors ensures that $\Pi$ is a valid
POVM. It also implies several properties of the vectors
$|u_i\rangle$, which are summarized in the following lemma.

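Lemma 1: Let $\{|u_i\rangle\}_{i=1}^m$ be a set of vectors which satisfy
(12). Then,

$$\langle u_i|u_i\rangle \le 1, \tag{13}$$

$$\text{if } \langle u_i|u_i\rangle = 1 \text{ then } \langle u_i|u_j\rangle = \delta_{i,j}, \tag{14}$$

$$\sum_{i=1}^{m} \langle u_i|u_i\rangle = n. \tag{15}$$

Proof: See Appendix A. ■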
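
A small numerical check of the tight frame condition and of properties (13) and (15) is sketched below. The example frame (three real vectors at 120 degrees in a two-dimensional space, scaled by $\sqrt{2/3}$) and the helper name `frame_operator` are assumptions chosen for illustration.

```python
import numpy as np

def frame_operator(vectors):
    """Sum of the outer products |u_i><u_i| over all frame vectors."""
    return sum(np.outer(u, np.conj(u)) for u in vectors)

# Assumed example: m = 3 vectors in dimension n = 2.
angles = 2 * np.pi * np.arange(3) / 3
frame = [np.sqrt(2 / 3) * np.array([np.cos(a), np.sin(a)]) for a in angles]

is_tight = bool(np.allclose(frame_operator(frame), np.eye(2)))   # condition (12)
sq_norms = np.array([np.vdot(u, u).real for u in frame])         # <u_i|u_i>

# Probability of correct detection of the corresponding TFES
# for equiprobable symbols: P_d = sum_i p_i <u_i|u_i>.
priors = np.full(3, 1 / 3)
p_d_tfes = float(priors @ sq_norms)
```

Each squared norm equals 2/3, which is at most 1 as in (13), and the squared norms sum to $n = 2$ as in (15). None of the vectors has unit norm, so (14) imposes no orthogonality here.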

Tight frames are of interest in many fields and applications 
where one seeks a set of vectors whose mutual "interference" 
is minimal. Specifically, in classical communication, they play 
an important role in Synchronous CDMA systems [26], [27]. 
Also, the simplex constellation, which is known to be optimal 
under certain energy constraints [28], [29], is a tight frame. 

The significance of TFESs to quantum encoding is estab- 
lished by the following result: 

Theorem 3: All ensemble-detector setups $\{\rho_i, \Pi_i\}_{i=1}^m$
which achieve $P_d(\Pi_i, \rho_i) = \hat{P}_d$ are TFESs.

The proof of Theorem 3 relies on the following lemma,
whose proof is given in Appendix B.

Lemma 2: For any ensemble-detector setup $(\rho_i, \Pi_i)$ which
achieves $P_d(\Pi_i, \rho_i) = \hat{P}_d$, the largest eigenvalues of the
detection operators satisfy

$$\sum_{i=1}^{m} \sigma^{\max}_{\Pi_i} = n.$$

Proof: (of Thm. 3) For any POVM, we have that

$$\mathrm{Tr}(\Pi_i) \ge \sigma^{\max}_{\Pi_i}, \tag{16}$$

$$\sum_{i=1}^{m} \mathrm{Tr}(\Pi_i) = n, \tag{17}$$

where (17) comes from taking the trace of the requirement
$\sum_{i=1}^{m} \Pi_i = I$.

Assume that an ensemble-detector setup $\{\rho_i, \Pi_i\}$ achieves
$\hat{P}_d$. Using (17) with Lemma 2 we get that

$$\sum_{i=1}^{m} \mathrm{Tr}(\Pi_i) = \sum_{i=1}^{m} \sigma^{\max}_{\Pi_i},$$

which, in conjunction with (16), shows that for any such
detector

$$\mathrm{Tr}(\Pi_i) = \sigma^{\max}_{\Pi_i}, \quad 1 \le i \le m.$$

This in turn implies that

$$\mathrm{rank}(\Pi_i) \le 1, \quad 1 \le i \le m,$$

i.e. the detection elements of any detector which is part of an
optimal setup are of the form $\Pi_i = |u_i\rangle\langle u_i|$ (where
$|u_i\rangle$ may also be the null vector). In order for $\Pi$ to be a
valid POVM, the set of vectors $\{|u_i\rangle\}_{i=1}^m$ must obey (12).

Since $\{\rho_i, \Pi_i\}$ is assumed to be an optimal setup, $\rho_i$
must be an optimal ensemble for the detector $\Pi$. Thus, from
Theorem 1 we have that for any $i$ such that $\langle u_i|u_i\rangle > 0$,

$$\rho_i = \frac{1}{\langle u_i|u_i\rangle} |u_i\rangle\langle u_i|.$$

If $\langle u_i|u_i\rangle = 0$, then $\rho_i$ can be any quantum state. ■

An interesting aspect of the above result is that Pd can only 
be attained by setups in which the detected code-states (those 
for which the corresponding measurement operator is not zero) 
are pure states. This is hardly surprising, since obviously, for 
mixed states the chances of "interference" between code-states 
are greater. 



B. Choice of TFES 

From Theorem 3 we know that all optima are TFESs. Not
all choices of TFES are, however, necessarily optimal. We now
show that the set of optimal TFESs is dependent on the prior
probabilities $\{p_i\}_{i=1}^m$, and on the dimension of the quantum
medium $n$, and characterize this dependence. The following
results (Theorem 4 and corollaries) fully characterize all
optimal solutions for a given prior distribution and dimension
$n$.

In order to formulate our results we introduce a classification
of the symbols into three distinct subsets, according to the
prior probability distribution and the dimension $n$. Recalling
that we assume $p_1 \ge p_2 \ge \cdots \ge p_m$, we define

• $\mathcal{I}_1 = \{i \mid p_i > p_n\}$,
• $\mathcal{I}_2 = \{i \mid p_i = p_n\}$,
• $\mathcal{I}_3 = \{i \mid p_i < p_n\}$.

Note that $\mathcal{I}_1$ and $\mathcal{I}_3$ may be empty.

Theorem 4: Let $\{p_i\}_{i=1}^m$ be a non-increasing distribution of
probabilities, and let $\{|u_i\rangle\}_{i=1}^m$ be the vectors of a TFES
in a quantum system of dimension $n$. This TFES is optimal in the
sense of probability of correct detection if and only if (i) for
all $i \in \mathcal{I}_1$, $\langle u_i|u_i\rangle = 1$, and (ii) for all
$i \in \mathcal{I}_3$, $\langle u_i|u_i\rangle = 0$.

Before proving Theorem 4 we point out the following
important corollaries:

Corollary 4.1: Let $\{p_i\}_{i=1}^m$ be a non-increasing distribution
of probabilities, and let $\{\rho_i, \Pi_i\}_{i=1}^m$ be an optimal
encoding setup in a quantum system of dimension $n$. Then

1) $\Pr\{j|i\} = \delta_{i,j}$ for $i \in \mathcal{I}_1$,

2) $\Pr\{\text{det } i\} = 0$ for $i \in \mathcal{I}_3$.

Proof: From Theorem 3 we know that the ensemble-detector
setup is a TFES. From Theorem 4, for all $i \in \mathcal{I}_1$,
$\Pi_i = |u_i\rangle\langle u_i|$ such that $\langle u_i|u_i\rangle = 1$.
Together with (14), it is easy to see that

$$\Pr\{j|i\} = \mathrm{Tr}(\Pi_j \rho_i) = |\langle u_j|u_i\rangle|^2 = \delta_{i,j}.$$

Also from Theorem 4, for all $i \in \mathcal{I}_3$, $\Pi_i = 0$, indicating
that the probability of detecting the $i$-th message is
$\Pr\{\text{det } i\} = \sum_{j} p_j \mathrm{Tr}(\Pi_i \rho_j) = 0$. ■

Corollary 4.2: Let $\{p_i\}_{i=1}^m$ be a non-increasing distribution
of probabilities. If $p_n > p_{n+1}$ then any optimal setup
$\{\rho_i, \Pi_i\}_{i=1}^m$ must be of the form (4) (pseudo-classical).

Proof: From Theorem 3 the optimal setup must be a
TFES. When $p_n > p_{n+1}$ we have $\mathcal{I}_3 = \{n+1, \ldots, m\}$,
which, using Theorem 4, indicates that

$$\langle u_i|u_i\rangle = 0, \quad n + 1 \le i \le m.$$

Together with (12) this implies

$$\sum_{i=1}^{n} |u_i\rangle\langle u_i| = I. \tag{18}$$

A set of $n$ vectors in $n$-dimensional space can satisfy (18)
if and only if they form an orthonormal set. Thus, the only
optimal setup when $p_n > p_{n+1}$ is (4). ■

Corollary 4.3: If $p_i = \frac{1}{m}$ for all $i$, then all TFESs achieve
$\hat{P}_d$.

Proof: The Corollary follows directly from Theorem 4
for $\mathcal{I}_1 = \mathcal{I}_3 = \emptyset$. ■

We now prove Theorem 4.

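Proof: (of Thm. 4) Assume that $\{|u_i\rangle\}_{i=1}^m$ are the vectors
of a TFES which is optimal in the sense of $P_d$.

Assume that $\mathcal{I}_1 \neq \emptyset$ and denote by $k$ the largest
index in $\mathcal{I}_1$ (i.e. $\mathcal{I}_1 = \{1, \ldots, k\}$). This means
that $p_k > p_{k+1} = p_{k+2} = \cdots = p_n$ (from the definition of
$\mathcal{I}_1$, we have that $k < n$). For any TFES we can write

$$P_d = \sum_{i=1}^{m} p_i \langle u_i|u_i\rangle = \sum_{i=1}^{k} p_i \langle u_i|u_i\rangle + \sum_{i=k+1}^{m} p_i \langle u_i|u_i\rangle$$

$$\le \sum_{i=1}^{k} p_i \langle u_i|u_i\rangle + p_{k+1} \sum_{i=k+1}^{m} \langle u_i|u_i\rangle \tag{19}$$

$$= \sum_{i=1}^{k} p_i \langle u_i|u_i\rangle + p_{k+1} \left( n - \sum_{i=1}^{k} \langle u_i|u_i\rangle \right) \tag{20}$$

$$= \sum_{i=1}^{k} \left[ p_i \langle u_i|u_i\rangle + p_{k+1} (1 - \langle u_i|u_i\rangle) \right] + (n - k) p_{k+1}, \tag{21}$$

where the transition from (19) to (20) relies on (15).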

Recall that for all $i \in \mathcal{I}_1$ we have $p_i > p_{k+1}$. If for
some $1 \le i \le k$, $\langle u_i|u_i\rangle < 1$, then from (21)

$$P_d < \sum_{i=1}^{k} \left[ p_i \langle u_i|u_i\rangle + p_i (1 - \langle u_i|u_i\rangle) \right] + (n - k) p_{k+1} = \sum_{i=1}^{k} p_i + \sum_{i=k+1}^{n} p_i = \hat{P}_d.$$

Here we have relied on the fact that $p_{k+1} = p_{k+2} = \cdots = p_n$.
Therefore, in order to achieve $\hat{P}_d$, the vectors $|u_i\rangle$ must
satisfy

$$\langle u_i|u_i\rangle = 1, \quad i \in \mathcal{I}_1.$$

This concludes the proof of the first statement of the 'only if'
direction.

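We go on to prove the second statement. Assume that
$k' = \max \mathcal{I}_2 < m$ (i.e. $\mathcal{I}_3 = \{k' + 1, \ldots, m\} \neq \emptyset$).
By definition $p_{k'} > p_{k'+1}$. We again have

$$P_d = \sum_{i=1}^{m} p_i \langle u_i|u_i\rangle = \sum_{i=1}^{k'} p_i \langle u_i|u_i\rangle + \sum_{i=k'+1}^{m} p_i \langle u_i|u_i\rangle.$$

If for some $i \in \mathcal{I}_3$, $\langle u_i|u_i\rangle > 0$, then

$$P_d < \sum_{i=1}^{k'} p_i \langle u_i|u_i\rangle + p_{k'} \sum_{i=k'+1}^{m} \langle u_i|u_i\rangle \tag{22}$$

$$= \sum_{i=1}^{n} p_i \langle u_i|u_i\rangle + p_n \sum_{i=n+1}^{m} \langle u_i|u_i\rangle \tag{23}$$

$$= \sum_{i=1}^{n} p_i \langle u_i|u_i\rangle + p_n \left( n - \sum_{i=1}^{n} \langle u_i|u_i\rangle \right) \tag{24}$$

$$= \sum_{i=1}^{n} p_i \langle u_i|u_i\rangle + p_n \sum_{i=1}^{n} (1 - \langle u_i|u_i\rangle) \le \sum_{i=1}^{n} p_i = \hat{P}_d,$$

where the transitions from (22) to (24) rely on the fact that
$k' \in \mathcal{I}_2$ and on (15), and the last inequality uses
$p_i \ge p_n$ for $i \le n$ together with (13). Thus, for any TFES which
achieves maximal $P_d$

$$\langle u_i|u_i\rangle = 0, \quad i \in \mathcal{I}_3.$$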

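We continue by proving the 'if' direction. Assume that for
all $i \in \mathcal{I}_1$, $\langle u_i|u_i\rangle = 1$, and that for all
$i \in \mathcal{I}_3$, $\langle u_i|u_i\rangle = 0$. We must first note that
under these conditions, using (15) yields

$$\sum_{i=1}^{k'} \langle u_i|u_i\rangle = n. \tag{25}$$

If $\mathcal{I}_1 = \emptyset$, then $\mathcal{I}_2 = \{1, \ldots, k'\}$. We can
then write

$$P_d = \sum_{i=1}^{m} p_i \langle u_i|u_i\rangle \tag{26}$$

$$= p_n \sum_{i=1}^{k'} \langle u_i|u_i\rangle \tag{27}$$

$$= n p_n = \sum_{i=1}^{n} p_i = \hat{P}_d, \tag{28}$$

where the transition from (26) to (27) relies on the facts that
for all $i > k'$, $\langle u_i|u_i\rangle = 0$, and
$p_1 = p_2 = \cdots = p_{k'} = p_n$. The transition from (27) to (28)
is based on (25).

If $\mathcal{I}_1 = \{1, \ldots, k\}$ and $\mathcal{I}_2 = \{k + 1, \ldots, k'\}$
then, similarly,

$$P_d = \sum_{i=1}^{m} p_i \langle u_i|u_i\rangle = \sum_{i=1}^{k} p_i \langle u_i|u_i\rangle + p_{k+1} \sum_{i=k+1}^{k'} \langle u_i|u_i\rangle = \sum_{i=1}^{k} p_i + (n - k) p_{k+1} = \sum_{i=1}^{n} p_i = \hat{P}_d,$$

thereby completing the proof. ■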

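Theorem 5 below summarizes the assertions of Theorems
2, 3 and 4 in concise form, and completely characterizes all
optimal transmitter-receiver setups.

Theorem 5: Let $\{p_i\}_{i=1}^m$ be a probability distribution with
$p_1 \ge p_2 \ge \cdots \ge p_m > 0$. For a given number $n < m$, define
the index sets

• $\mathcal{I}_1 = \{i \mid p_i > p_n\}$,
• $\mathcal{I}_2 = \{i \mid p_i = p_n\}$,
• $\mathcal{I}_3 = \{i \mid p_i < p_n\}$.

The maximal probability of correct detection for a quantum
system of dimension $n < m$ is

$$\hat{P}_d = \sum_{i=1}^{n} p_i.$$

The optimum is achieved if and only if the ensemble-detector
setup is of the form

$$\Pi_i = |u_i\rangle\langle u_i|, \qquad \rho_i = \begin{cases} \dfrac{1}{\langle u_i|u_i\rangle} |u_i\rangle\langle u_i|, & \langle u_i|u_i\rangle > 0 \\ \text{Don't care}, & \langle u_i|u_i\rangle = 0 \end{cases}$$

where the vectors $\{|u_i\rangle\}_{i=1}^m$ obey

$$\sum_{i=1}^{m} |u_i\rangle\langle u_i| = I, \qquad \langle u_i|u_i\rangle = 1, \; i \in \mathcal{I}_1, \qquad \langle u_i|u_i\rangle = 0, \; i \in \mathcal{I}_3.$$
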
Put in words, maximum $P_d$ can only be attained by a TFES,
where the messages with high prior probabilities ($i \in \mathcal{I}_1$) are
encoded using orthogonal code states, and are thus recovered
perfectly (Corollary 4.1), and the messages with low prior
probabilities ($i \in \mathcal{I}_3$) are discarded, much like in
pseudo-classical encoding. In choosing the remaining frame vectors,
one has freedom and they can be chosen to be non-orthogonal.
Important special cases are when $p_n > p_{n+1}$, where one has no
freedom and the only optimum is pseudo-classical encoding
(Corollary 4.2), and the equiprobable case $p_i = \frac{1}{m}$, where
there is complete freedom in choosing the TFES frame vectors
(Corollary 4.3).
C. Application to the Analysis of Communication Protocols 

In many applications, additional constraints, other than the
ones imposed by the physics, are placed on the
encoding-retrieval setup. In quantum key distribution [3], for
example, constraints arise due to the need for security against
eavesdropping. Further constraints may occur due to technical
(implementation) issues. The work at hand can then serve
two purposes. The first is to quantify the degradation in $P_d$
due to the need to meet the extra design constraints. This
can be done by simply comparing the performance of the
constrained system to the theoretical upper bound $\hat{P}_d$. The
second possible use of this work, in this context, is to search
within the set of optimal TFESs for a setup which is close to
meeting the demands posed by the application. When taking
the latter approach we are assured optimal performance with
regard to $P_d$.

Consider the BB84 protocol [30]. In this QKD protocol, Alice wishes to send Bob secure binary information. In order to counter possible eavesdropping, she sends one of $m = 4$ messages with $p_i = \frac{1}{4}$ over a 2-dimensional quantum channel. The code-states used are denoted $|u_{ij}\rangle$, where $i,j = 0, 1$, and they obey the relations

$$|\langle u_{ij}|u_{i'j'}\rangle|^2 = \begin{cases} \delta_{jj'}, & i = i', \\ 1/2, & i \neq i', \end{cases} \qquad \frac{1}{2}\sum_{i,j=0}^{1} |u_{ij}\rangle\langle u_{ij}| = I. \qquad (29)$$

Note that (29) indicates that this collection of vectors is a tight frame.

Bob utilizes the POVM (of order 4) $\Pi_{ij} = \frac{1}{2}|u_{ij}\rangle\langle u_{ij}|$ in order to retrieve Alice's message. They then exchange knowledge on which "pair of states" was received (by, for example, comparing the $i$ index). If both the sent and the detected symbols originate from the same pair, then the transferred bit of information is taken as the member of the pair that was detected (the $j$ index). If the symbols originate from different pairs, the received symbol is discarded. In order to promote security, Alice and Bob use $m > n$, at a cost of reduced data rate. The security of this protocol has been extensively studied.

The probability of correct detection achieved by Bob prior to the exchange of the $i$ index is $P_d = 1/2$. This is equal to the upper bound $\sum_{i=1}^{n} p_i = 2 \cdot \frac{1}{4}$ for this case, meaning that under the requirement of countering eavesdropping, Bob achieves the maximal possible performance. The fact that the upper bound is reached would hardly surprise most readers, in the context of a protocol as simple as BB84. It does, however, serve to illustrate the possible use of the unconstrained upper bound on $P_d$ in quantifying the efficacy of more complex communication protocols.
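The BB84 figures quoted above can be checked numerically. The sketch below assumes one concrete realization of the four code-states (the rectilinear and diagonal bases of a real two-dimensional space); this choice, and the helper names `outer` and `trace_prod`, are illustrative assumptions rather than part of the protocol description.

```python
import math

def outer(u):
    """Rank-one operator |u><u| for a real 2-vector u, as a nested list."""
    return [[u[r] * u[c] for c in range(2)] for r in range(2)]

def trace_prod(a, b):
    """Tr(a b) for 2x2 matrices given as nested lists."""
    return sum(a[r][c] * b[c][r] for r in range(2) for c in range(2))

s = 1 / math.sqrt(2)
# Assumed realization: basis i = 0 is rectilinear, basis i = 1 is diagonal.
states = {(0, 0): (1.0, 0.0), (0, 1): (0.0, 1.0),
          (1, 0): (s, s),     (1, 1): (s, -s)}

rho = {k: outer(u) for k, u in states.items()}                       # code states
povm = {k: [[0.5 * x for x in row] for row in rho[k]] for k in rho}  # |u><u| / 2

# Tight-frame check: the POVM elements should sum to the identity.
frame_sum = [[sum(povm[k][r][c] for k in povm) for c in range(2)] for r in range(2)]

prior = 0.25
p_d = sum(prior * trace_prod(povm[k], rho[k]) for k in states)
bound = 2 * prior   # sum of the n = 2 largest priors

print(frame_sum, p_d, bound)
```

Up to rounding, `frame_sum` is the identity and `p_d` coincides with `bound`, in agreement with the statement that BB84 attains the unconstrained upper bound.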

VI. Optimal Worst-Case Posterior Probability 

An alternative quality of service measure for systems of digital communication/storage is the worst-case posterior probability [31], [13]. The posterior probability, defined as

$$P_p(i) = \frac{\Pr\{\text{message } i \text{ detected correctly}\}}{\Pr\{\text{message } i \text{ detected}\}} = \frac{p_i \operatorname{Tr}(\Pi_i \rho_i)}{\sum_{j} p_j \operatorname{Tr}(\Pi_i \rho_j)}, \qquad (30)$$

is the answer to the question: "Given that the detected message is $i$, what is the probability that it is the right answer?". The worst-case posterior probability is then

$$P_p = \min_{i=1,\ldots,m} P_p(i).$$

The higher the value of $P_p$, the more reliable the output of the measurement.

Denote by $\Pr\{\text{det } i\} = \sum_j p_j \operatorname{Tr}(\Pi_i \rho_j)$ the probability of detecting the $i$-th outcome. By definition

$$P_p \cdot \Pr\{\text{det } i\} \le p_i \Pr\{i|i\}, \quad 1 \le i \le m. \qquad (31)$$

Summing the inequalities (31) over $i$, and noting that $\sum_i \Pr\{\text{det } i\} = 1$, one gets

$$P_p \le \sum_{i=1}^{m} p_i \Pr\{i|i\} = P_d.$$

Thus, in any digital encoding system (not necessarily quantum mechanical) the value of $P_p$ is bounded above by the value of $P_d$. In particular, for quantum systems, this means that a universal upper bound on $P_p$ is the maximal $P_d$ of Theorem 5. Theorem 6 below provides a simple method for finding an upper bound on $P_p$ for a given set of code-states $\rho_i$ and prior probabilities $p_i$. We present an example in which our bound is tighter than the universal bound $P_d$.
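The inequality $P_p \le P_d$ can be illustrated on a small example. The sketch below evaluates (30) for a hypothetical three-state ensemble in two dimensions, measured with the POVM $\Pi_i = \frac{2}{3}|u_i\rangle\langle u_i|$. Both the ensemble and the measurement are assumptions made for illustration; in particular this measurement is not claimed to be optimal in either sense.

```python
import math

def outer(u):
    """Rank-one operator |u><u| for a real vector u."""
    return [[a * b for b in u] for a in u]

def scale(c, mat):
    return [[c * x for x in row] for row in mat]

def trace_prod(a, b):
    """Tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[r][c] * b[c][r] for r in range(n) for c in range(n))

def posterior_probabilities(priors, states, povm):
    """P_p(i) = p_i Tr(Pi_i rho_i) / sum_j p_j Tr(Pi_i rho_j), as in (30)."""
    result = []
    for i, pi_i in enumerate(povm):
        # Pr{det i}; assumed non-zero here, otherwise P_p(i) is ill-defined.
        pr_det = sum(p * trace_prod(pi_i, rho) for p, rho in zip(priors, states))
        result.append(priors[i] * trace_prod(pi_i, states[i]) / pr_det)
    return result

def detection_probability(priors, states, povm):
    """P_d = sum_i p_i Tr(Pi_i rho_i)."""
    return sum(p * trace_prod(pi, rho) for p, pi, rho in zip(priors, povm, states))

h = math.sqrt(3) / 2
vectors = [(0.0, 1.0), (h, -0.5), (-h, -0.5)]   # hypothetical unit vectors
priors = [0.4, 0.3, 0.3]
states = [outer(u) for u in vectors]
povm = [scale(2 / 3, rho) for rho in states]    # sums to the identity

posteriors = posterior_probabilities(priors, states, povm)
p_p = min(posteriors)
p_d = detection_probability(priors, states, povm)
print(posteriors, p_p, p_d)
```

For this setup the worst-case posterior probability is $12/19 \approx 0.632$, below the detection probability $2/3$ of the same setup, as (31) requires.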

Obtaining the optimal measurement in the sense of $P_p$ involves a bisection procedure, where each step is computationally expensive (solving an SDP) [13]. The bound obtained using our method can serve to shorten the initial bisection interval, thereby reducing the computational cost of finding the optimal detector. We also hope that our method can serve to find tighter universal upper bounds on $P_p$.

Note that for the pseudo-classical TFES, and in fact for any setup in which one of the POVM elements is zero, the posterior probability is ill-defined, since the denominator in (30) is zero. We therefore introduce a surrogate measure of the reliability of the outcome, designed to replace $P_p$ in this case.

Since we seek a measure of reliability of the output, there is no point in taking into account outputs which never occur. Hence we choose to measure the most unreliable outcome, of the set of possible outcomes. The effective worst-case posterior probability is defined as

$$P_p^{\text{eff}} = \min_{i:\, \Pr\{\text{det } i\} > 0} P_p(i).$$

Note that whenever $P_p$ is well defined, $P_p^{\text{eff}} = P_p$. In addition, the upper bound $P_d$ also holds for $P_p^{\text{eff}}$.

According to Theorem 5, in many cases ensemble-detector setups which attain optimal $P_d$ are TFESs whose detectors have zero elements. For all $i$ such that $\Pi_i = 0$, we can choose the code states freely, without degrading the performance in $P_d$. This raises the question: how should one choose the 'don't care' states, so that the output of the system would be reliable? We show that for the pseudo-classical TFES, there is a choice of 'don't care' states which attains the maximum value of $P_p^{\text{eff}}$.

A. An Upper Bound on P p for a Given Ensemble 

Theorem 6: Let $\{\rho_i\}_{i=1}^{m}$ be $m$ arbitrary quantum states of dimension $n$, with prior probabilities $p_i$. Define the operators

$$A_i(\delta) = (1 - \delta)\sum_{k=1}^{m} p_k \rho_k - p_i \rho_i,$$

where $\delta \in \mathbb{R}$. If for some $1 \le i \le m$, $A_i(\delta) \ge 0$, then $P_p \le 1 - \delta$.

Proof: Assume that $\{\rho_i\}_{i=1}^{m}$ is an arbitrary ensemble of quantum states with prior probabilities $p_i$. Denote by $\hat{\Pi}$ and $\hat{s}$ the solution to

$$\min \; s \quad \text{s.t.} \quad \begin{cases} \Pi_i \ge 0, \\ \sum_{i=1}^{m} \Pi_i = I, \\ \operatorname{Tr}[\Pi_i A_i(\delta)] \le s, \quad 1 \le i \le m. \end{cases} \qquad (32)$$

In [13] it was shown that if the value of (32) is non-negative (i.e. $\hat{s} \ge 0$) for a specific choice of $\delta$, then$^{2}$ $P_p(\hat{\Pi}) \le 1 - \delta$. This statement contains a slight inaccuracy, because for $P_p(\hat{\Pi})$ to be well-defined, one must also include in (32) the constraint $\Pi_i \neq 0$ (the authors do mention 'taking a short cut').
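The dual program of (32) is

$$\max \; \operatorname{Tr}(Y) \quad \text{s.t.} \quad \begin{cases} \lambda_i \ge 0, \\ \sum_{i=1}^{m} \lambda_i = 1, \\ \lambda_i A_i(\delta) - Y \ge 0. \end{cases}$$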

This means that if, for a specific value of $\delta$, one can find real scalars $\lambda_i$ and an operator $Y$, such that

$$\lambda_i \ge 0, \qquad \sum_{i=1}^{m} \lambda_i = 1, \qquad \lambda_i A_i(\delta) - Y \ge 0, \qquad (33)$$

then $\operatorname{Tr}(Y) \le \hat{s}$. Therefore, if in addition to the requirements (33), $Y$ also satisfies $\operatorname{Tr}(Y) \ge 0$, then we are assured that $\hat{s} \ge 0$, and that $1 - \delta$ is an upper bound on the optimal posterior probability $P_p$.

$^{2}$ Actually, the authors of [13] are concerned with an error function which is equal to $1 - P_p$.

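Define the index subset

$$Q(\delta) = \{i \mid A_i(\delta) \ge 0\},$$

and denote its cardinality by $|Q(\delta)|$. If $Q(\delta)$ is non-empty, then we can choose

$$Y = 0, \qquad \lambda_i = \begin{cases} 0, & i \notin Q(\delta), \\ \dfrac{1}{|Q(\delta)|}, & i \in Q(\delta), \end{cases}$$

which satisfy all the above requirements (33). Thus, whenever $Q(\delta)$ is non-empty, $1 - \delta$ is an upper bound on $P_p$. ■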

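As an example of the application of Theorem 6, we examine an ensemble comprised of the pure states $\rho_i = |u_i\rangle\langle u_i|$ in a two dimensional Hilbert space,

$$u_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad u_2 = \frac{1}{2}\begin{pmatrix} \sqrt{3} \\ -1 \end{pmatrix}, \qquad u_3 = -\frac{1}{2}\begin{pmatrix} \sqrt{3} \\ 1 \end{pmatrix},$$

with prior probabilities

$$p_1 = 0.4, \qquad p_2 = p_3 = 0.3.$$

For this ensemble,

$$A_2(\delta) = \frac{1}{40}\begin{pmatrix} 9 - 18\delta & \sqrt{27} \\ \sqrt{27} & 19 - 22\delta \end{pmatrix},$$

whose eigenvalues are

$$\alpha_{1,2} = \frac{1}{20}\left(7 - 10\delta \pm \sqrt{\delta^2 - 5\delta + 13}\right).$$

This implies that $A_2(\delta)$ is PSD for any $\delta \le 0.36$, and thus the upper bound provided by Theorem 6 is $P_p \le 0.64$. This is an improvement over the universal bound $P_d = 0.7$.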
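The bound of Theorem 6 for this example can also be located numerically. The sketch below assumes real unit vectors in two dimensions, so that the smallest eigenvalue of $A_i(\delta)$ has a closed form, and uses the fact that this eigenvalue is non-increasing in $\delta$ (the derivative of $A_i(\delta)$ is $-\sum_k p_k \rho_k \le 0$). The function names and the bisection on $[0, 1]$ are our own illustrative choices.

```python
import math

def min_eig_sym2(a, b, d):
    """Smallest eigenvalue of the real symmetric matrix [[a, b], [b, d]]."""
    return (a + d) / 2 - math.hypot((a - d) / 2, b)

def projector(u):
    """Entries (P11, P12, P22) of |u><u| for a real unit 2-vector u."""
    return (u[0] * u[0], u[0] * u[1], u[1] * u[1])

def a_matrix(i, delta, priors, vectors):
    """Entries (a, b, d) of A_i(delta) = (1 - delta) sum_k p_k rho_k - p_i rho_i."""
    mix = [sum(p * projector(u)[t] for p, u in zip(priors, vectors)) for t in range(3)]
    own = projector(vectors[i])
    return tuple((1 - delta) * mix[t] - priors[i] * own[t] for t in range(3))

def largest_psd_delta(i, priors, vectors, iterations=60):
    """Bisect on [0, 1] for the largest delta that keeps A_i(delta) PSD."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if min_eig_sym2(*a_matrix(i, mid, priors, vectors)) >= 0:
            lo = mid
        else:
            hi = mid
    return lo

h = math.sqrt(3) / 2
vectors = [(0.0, 1.0), (h, -0.5), (-h, -0.5)]
priors = [0.4, 0.3, 0.3]

deltas = [largest_psd_delta(i, priors, vectors) for i in range(3)]
bound = 1 - max(deltas)   # tightest bound on P_p offered by Theorem 6
print(deltas, bound)
```

The threshold for $i = 2$ comes out at $\delta = 4/11 \approx 0.364$, so the bound is $7/11 \approx 0.636$, which rounds to the value 0.64 quoted above. The operator $A_1(\delta)$ yields the weaker threshold $3/11$.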



B. Choosing the 'Don't Care' States of Optimal TFESs 

In many situations, setups which attain maximum $P_d$ have $\Pi_i = 0$ for some $i$. When this is the case, there are undecided degrees of freedom to the TFES: the 'don't care' states. We would like to be able to choose these states so that the outcome of the measurement is reliable. We measure the reliability using $P_p^{\text{eff}}$ defined above.

We present a choice of 'don't care' states for the pseudo-classical setup for which $P_p^{\text{eff}} = P_d$, i.e. when the pseudo-classical setup is used with this choice of 'don't care' states, its performance is optimal both in terms of $P_d$ and in terms of $P_p^{\text{eff}}$.

Theorem 7: When using the pseudo-classical TFES, with the choice

$$\rho_j = \frac{1}{\sum_{k=1}^{n} p_k}\sum_{i=1}^{n} p_i |u_i\rangle\langle u_i|, \qquad j = n+1, \ldots, m$$

for the 'don't care' states, $P_p^{\text{eff}}$ attains the upper bound $P_d$.






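Proof: For all $i \le n$ we get

$$P_p(i) = \frac{p_i}{p_i + \dfrac{p_i}{\sum_{k=1}^{n} p_k}\sum_{j=n+1}^{m} p_j} = \frac{\sum_{k=1}^{n} p_k}{\sum_{k=1}^{n} p_k + \sum_{j=n+1}^{m} p_j} = \sum_{k=1}^{n} p_k.$$

Thus $P_p^{\text{eff}} = \sum_{i=1}^{n} p_i = P_d$. ■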
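The effect of the 'don't care' states on $P_p^{\text{eff}}$ can be seen on a small instance. In the sketch below every 'don't care' state is assumed to be diagonal in the basis of the orthonormal code-states, and `weights[i]` stands for $\langle u_i|\rho_j|u_i\rangle$. The function name, the example prior and the 'naive' alternative are hypothetical.

```python
from fractions import Fraction

def effective_posterior(priors, n, weights):
    """P_p^eff for the pseudo-classical setup (priors sorted in decreasing order).

    Outcomes i > n have Pi_i = 0, never occur, and are therefore excluded.
    For i <= n: Pr{det i} = p_i + weights[i] * (total prior of discarded messages).
    """
    tail = sum(priors[n:])
    posteriors = [priors[i] / (priors[i] + tail * weights[i]) for i in range(n)]
    return min(posteriors)

priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # hypothetical prior
n = 2
p_d = sum(priors[:n])

theorem7_weights = [p / p_d for p in priors[:n]]   # the choice of Theorem 7
naive_weights = [Fraction(1), Fraction(0)]         # 'don't care' state equal to rho_1

best = effective_posterior(priors, n, theorem7_weights)
naive = effective_posterior(priors, n, naive_weights)
print(best, naive, p_d)
```

With the weights of Theorem 7 the result equals $P_d = 4/5$, whereas sending the first code-state in place of the discarded message lowers $P_p^{\text{eff}}$ to $5/7$.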

VII. Conclusion 

We have addressed the question of retrieval of digital data 
encoded in a quantum medium, using as our main perfor- 
mance criterion the probability of correct detection. We have 
found the optimal code-states for an arbitrary detector, and 
the optimal encoding-retrieval setups for an arbitrary prior 
distribution. 

In terms of $P_d$ one cannot do better than pseudo-classical transmission (orthonormal code-states and measurement operators). We have also shown that of all the setups which attain maximal $P_d$, the pseudo-classical TFES can be made to have optimal effective worst-case posterior probability. We have, however, indicated that under certain circumstances, there are benefits to using fully quantum setups (non-orthogonal code-states).

The natural extension of this work is the design of optimal setups with added constraints. Such constraints may arise due to requirements other than reliable communication, such as the need for security discussed above. Constraints may also stem from implementation issues which are typical of specific quantum systems that regularly serve for transmission and storage of information.

Appendix A
Proof of Lemma 1

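For all $1 \le i \le m$ we have that

$$|\langle u_i|u_i\rangle|^2 \le \sum_{k=1}^{m} |\langle u_i|u_k\rangle|^2 = \langle u_i|\left(\sum_{k=1}^{m} |u_k\rangle\langle u_k|\right)|u_i\rangle.$$

Using (12), this implies that

$$|\langle u_i|u_i\rangle|^2 \le \langle u_i|I|u_i\rangle = \langle u_i|u_i\rangle.$$

Thereupon $\langle u_i|u_i\rangle \le 1$, proving the property (13).

If $\langle u_i|u_i\rangle = 1$ then

$$\langle u_i|u_i\rangle = \langle u_i|\left(\sum_{k=1}^{m} |u_k\rangle\langle u_k|\right)|u_i\rangle = \sum_{k=1}^{m} |\langle u_i|u_k\rangle|^2 = 1,$$

and since the $k = i$ term already equals 1, it follows that

$$\sum_{k \neq i} |\langle u_i|u_k\rangle|^2 = 0.$$

Since this is a sum of nonnegative numbers, then for all $k \neq i$ we have

$$|\langle u_i|u_k\rangle|^2 = 0 \;\Longrightarrow\; \langle u_i|u_k\rangle = 0,$$

proving (14). Property (15) follows from taking the trace of (12).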
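The three properties established in this appendix can be checked on a concrete tight frame. The frame below, with one unit-norm vector and two shorter ones, is a hypothetical example chosen for illustration; the helper names are likewise our own.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def frame_operator(vectors):
    """Matrix of sum_k |u_k><u_k| for real vectors, as a nested list."""
    n = len(vectors[0])
    return [[sum(u[r] * u[c] for u in vectors) for c in range(n)] for r in range(n)]

def is_tight_frame(vectors, tol=1e-12):
    """Check the tight-frame condition sum_k |u_k><u_k| = I."""
    op = frame_operator(vectors)
    n = len(op)
    return all(abs(op[r][c] - (1.0 if r == c else 0.0)) <= tol
               for r in range(n) for c in range(n))

a = math.sqrt(0.5)
frame = [(1.0, 0.0), (0.0, a), (0.0, -a)]   # hypothetical frame with n = 2, m = 3

norms = [dot(u, u) for u in frame]

# No frame vector has squared norm above 1.
norms_bounded = all(x <= 1 + 1e-12 for x in norms)
# A unit-norm frame vector is orthogonal to all the others.
unit_is_orthogonal = all(abs(dot(frame[0], u)) <= 1e-12 for u in frame[1:])
# The squared norms sum to the dimension n (trace of the frame operator).
norm_sum = sum(norms)

print(is_tight_frame(frame), norms_bounded, unit_is_orthogonal, norm_sum)
```

Here the first vector has unit norm and is orthogonal to the other two, while the squared norms add up to the dimension $n = 2$.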

Appendix B 
Proof of Lemma 2

Assume that (fji , Ji) is a feasible point of the programme 
(II 01 . such that ji — 0. From the constraint (110b ). fji must 
satisfy fji > pi and then 



m n 



g(m,p) >^2pi>^2pi = .9(%,A), 

i=l i=l 

where (fji, p) are defined in (II II . Thus (fji, jl) cannot be a dual 
optimal point. All dual optimal points must satisfy /i ^ 0. 
One of the KKT conditions for the solution to problem (|8) 

is 



Since the dual optimal /i ^ 0, then any optimal values of ai 
must satisfy. 



(B.l) 



i=l 



Let {IF,} be a POVM, which is part of an optimal ensemble- 
detector setup, i.e. £. Picr 1 ^ ax = Pd- By choosing 



0% = (T u 



(B.2) 



fe=i 



fe=i 



we get J^i Pi&i — Pd, ensuring that &i are an optimum of (|S}, 
and thus satisfy ( IB. 11 . In conjunction with ( IB. 21 . this proves 
the Lemma. 